This data is from the lumber industry, giving the approximate number of board feet of lumber per tree in a forest of a given age. What function will fit the data? Predict the harvest for ages other than those given.
I began this problem by creating a scatter plot of the data in Excel. As the plot shows, the trend is positive and does not appear to be linear:
I then added a quadratic trend line. I had tried some others, but this one fits the data best: the curve passes through or very near all the data points, and the r-squared value is very high. Recall: r-squared is the proportion of the variation in the y-values that is accounted for by the fitted relationship between y and x. This means that the closer r-squared is to 1, the better the trend line fits the data.
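As a rough illustration of the r-squared idea recalled above, here is a minimal Python sketch. The function name r_squared is my own choice, and I am assuming the usual definition of 1 minus the ratio of the residual sum of squares to the total sum of squares; I have not checked that Excel reports exactly this number for every trend line type.

```python
def r_squared(observed, predicted):
    """Return 1 - SS_res / SS_tot for paired observed and predicted y-values.

    Assumption: this is the standard coefficient of determination.
    The function name is illustrative, not taken from Excel.
    """
    mean_y = sum(observed) / len(observed)
    # Variation left unexplained by the model.
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    # Total variation of the observed values around their mean.
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    return 1 - ss_res / ss_tot
```

A model that reproduces every observed value gives 1, while a model that always predicts the mean of the observed values gives 0.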
To see the other regression lines that were tried, click here.
I will now look at the residuals to see if the regression line is a good choice. A residual measures the prediction error at a data point. It is the vertical distance between the given point and the regression line, so each residual equals the actual value of y minus the predicted value of y. If I have created a good regression line for the data, the residuals should be small and a plot of them should show no pattern. Here is my residual plot:
Since my residuals are small and randomly scattered, I believe the quadratic regression line is a good fit.
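The residual calculation described above (actual y minus predicted y) can be sketched in Python as follows. The function names predict and residuals are hypothetical, and the coefficients are those of the quadratic trend line reported in this write-up. The ages and measured values from the table would be passed in as the two lists.

```python
def predict(age):
    """Predicted 100s of board feet from the quadratic trend line."""
    return 0.011 * age**2 - 0.6812 * age + 13.313


def residuals(ages, observed):
    """Return actual minus predicted y for each (age, observed) pair."""
    return [y - predict(x) for x, y in zip(ages, observed)]
```

A positive residual means the data point lies above the curve, and a negative one means it lies below.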
Now I can predict the missing values for 100s of board feet. This is done by using the equation y = 0.011x^2 - 0.6812x + 13.313 and plugging in the given x-value (the age of the tree).
For example, to find the predicted 100s of board feet for a 60-year-old tree:
y = 0.011(60)^2 - 0.6812(60) + 13.313
y = 39.6 - 40.872 + 13.313
y = 12.041
So a 60-year-old tree will give about 12 hundred board feet, that is, roughly 1,200 board feet.
The last column of the table is graphed below. This column contains the actual values that were given for 100s of board feet, with the missing values filled in with the predicted values.
I believe that this plot looks very good. Also, the quadratic equation fitted to this completed data set is very similar to my initial one. Because of this, I will keep my initial quadratic equation to model the data.
Thus my predicted values, in 100s of board feet, are 12 for an age of 60 years, 134 for 140 years, and 247 for 180 years.
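To double-check these predictions, the trend line equation can be evaluated for each missing age. This is only a small Python sketch; the function name predict is my own, and the coefficients are the ones from the quadratic trend line above.

```python
def predict(age):
    """Predicted 100s of board feet from the quadratic trend line."""
    return 0.011 * age**2 - 0.6812 * age + 13.313


# Ages whose values were missing from the data.
for age in (60, 140, 180):
    value = predict(age)
    print(f"age {age}: {value:.3f} (rounded: {round(value)})")
```

The rounded results agree with the values of 12, 134, and 247 stated above.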